AITopics | safety system

safety system

Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance

Wang, Xixi, Costa, Miguel, Kovaceva, Jordanka, Wang, Shuai, Pereira, Francisco C.

arXiv.org Artificial Intelligence | Nov-25-2025

Large language models (LLMs) have shown promise in table Question Answering (Table QA). However, extending these capabilities to multi-table QA remains challenging due to unreliable schema linking across complex tables. Existing methods based on semantic similarity work well only on simplified hand-crafted datasets and struggle to handle complex, real-world scenarios with numerous and diverse columns. To address this, we propose a graph-based framework that leverages human-curated relational knowledge to explicitly encode schema links and join paths. Given a natural language query, our method searches the graph to construct interpretable reasoning chains, aided by pruning and sub-path merging strategies to enhance efficiency and coherence. Experiments on both standard benchmarks and a realistic, large-scale dataset demonstrate the effectiveness of our approach. To our knowledge, this is the first multi-table QA system applied to truly complex industrial tabular data.
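As a rough illustration of what searching a schema graph for a join path can look like, the sketch below runs a breadth-first search over a small hand-written graph. It is not the authors' method: the table names, join keys, and the `find_join_path` helper are hypothetical, and the pruning and sub-path merging mentioned in the abstract are not modeled.

```python
from collections import deque

# Hypothetical schema graph: nodes are tables, edge labels are curated join keys.
SCHEMA_GRAPH = {
    "trips": {"vehicles": "vehicle_id", "drivers": "driver_id"},
    "vehicles": {"trips": "vehicle_id", "models": "model_id"},
    "drivers": {"trips": "driver_id"},
    "models": {"vehicles": "model_id"},
}


def find_join_path(graph, source, target):
    """Return the shortest list of (left_table, right_table, join_key) steps, or None."""
    if source == target:
        return []
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        table, path = queue.popleft()
        for neighbor, key in graph.get(table, {}).items():
            if neighbor in visited:
                continue
            step_path = path + [(table, neighbor, key)]
            if neighbor == target:
                return step_path
            visited.add(neighbor)
            queue.append((neighbor, step_path))
    # No chain of curated joins connects the two tables.
    return None


if __name__ == "__main__":
    for left, right, key in find_join_path(SCHEMA_GRAPH, "drivers", "models"):
        print(f"JOIN {left} -> {right} ON {key}")
```

Each step in the returned path names the two tables and the key that links them, so the chain can be read directly as a sequence of joins.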

large language model, machine learning, natural language, (18 more...)

doi: 10.18653/v1/2025.findings-emnlp.311

2506.04427

Country: Europe (0.46)

Genre: Research Report (0.82)

Industry: Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Safe-ROS: An Architecture for Autonomous Robots in Safety-Critical Domains

Benjumea, Diana C., Farrell, Marie, Dennis, Louise A.

arXiv.org Artificial Intelligence | Nov-19-2025

Deploying autonomous robots in safety-critical domains requires architectures that ensure operational effectiveness and safety compliance. In this paper, we contribute the Safe-ROS architecture for developing reliable and verifiable autonomous robots in such domains. It features two distinct subsystems: (1) an intelligent control system that is responsible for normal/routine operations, and (2) a Safety System consisting of Safety Instrumented Functions (SIFs) that provide formally verifiable independent oversight. We demonstrate Safe-ROS on an AgileX Scout Mini robot performing autonomous inspection in a nuclear environment. One safety requirement is selected and instantiated as a SIF. To support verification, we implement the SIF as a cognitive agent, programmed to stop the robot whenever it detects that it is too close to an obstacle. We verify that the agent meets the safety requirement and integrate it into the autonomous inspection. This integration is also verified, and the full deployment is validated in a Gazebo simulation and in lab testing. We evaluate this architecture in the context of the UK nuclear sector, where safety and regulation are crucial aspects of deployment. Success criteria include the development of a formal property from the safety requirement, the implementation and verification of the SIF, and the integration of the SIF into the operational robotic autonomous system. Our results demonstrate that the Safe-ROS architecture can provide verifiable safety oversight while deploying autonomous robots in safety-critical domains, offering a robust framework that can be extended to additional requirements and various applications.
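The stop-when-too-close rule described above can be sketched as a small override function that sits between the control system and the motors. This is only an illustrative sketch, not the paper's cognitive agent or any ROS API: the `VelocityCommand` type, the `safety_override` function, and the 0.5 m threshold are all assumptions made for the example.

```python
from dataclasses import dataclass

SAFE_DISTANCE_M = 0.5  # hypothetical threshold, not taken from the paper


@dataclass(frozen=True)
class VelocityCommand:
    linear: float   # forward speed in m/s
    angular: float  # turn rate in rad/s


STOP = VelocityCommand(0.0, 0.0)


def safety_override(obstacle_distance_m, requested, safe_distance_m=SAFE_DISTANCE_M):
    """Pass the requested command through unless the obstacle is too close.

    The comparison is written so that a NaN distance reading also yields STOP,
    because `nan >= x` is False.
    """
    if not obstacle_distance_m >= safe_distance_m:
        return STOP
    return requested


if __name__ == "__main__":
    cruise = VelocityCommand(linear=0.4, angular=0.1)
    print(safety_override(1.2, cruise))  # far from the obstacle: command passes through
    print(safety_override(0.3, cruise))  # too close: command is replaced by STOP
```

Keeping the rule in a pure function that is separate from the control logic mirrors the idea of independent oversight: the function can be checked on its own, whatever the controller requests.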

artificial intelligence, requirement, verification, (16 more...)

doi: 10.4204/EPTCS.436.6

2511.14433

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.86)

Industry:

Government (1.00)
Energy > Power Industry > Utilities > Nuclear (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols

Terekhov, Mikhail, Panfilov, Alexander, Dzenhaliou, Daniil, Gulcehre, Caglar, Andriushchenko, Maksym, Prabhu, Ameya, Geiping, Jonas

arXiv.org Artificial Intelligence | Oct-13-2025

AI control protocols serve as a defense mechanism to stop untrusted LLM agents from causing harm in autonomous settings. Prior work treats this as a security problem, stress testing with exploits that use the deployment context to subtly complete harmful side tasks, such as backdoor insertion. In practice, most AI control protocols are fundamentally based on LLM monitors, which can become a central point of failure. We study adaptive attacks by an untrusted model that knows the protocol and the monitor model, which is plausible if the untrusted model was trained with a later knowledge cutoff or can search for this information autonomously. We instantiate a simple adaptive attack vector by which the attacker embeds publicly known or zero-shot prompt injections in the model outputs. Using this tactic, frontier models consistently evade diverse monitors and complete malicious tasks on two main AI control benchmarks. The attack works universally against current protocols that rely on a monitor. Furthermore, the recent Defer-to-Resample protocol even backfires, as its resampling amplifies the prompt injection and effectively reframes it as a best-of-$n$ attack. In general, adaptive attacks on monitor models represent a major blind spot in current control protocols and should become a standard component of evaluations for future AI control mechanisms.
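To see why resampling can favor the attacker, consider a simplified model that is an assumption of this note rather than a result from the paper: each sampled output evades the monitor independently with probability $p$. The chance that at least one of $n$ samples gets through is then

```latex
\[
  P(\text{at least one of } n \text{ samples evades}) = 1 - (1 - p)^{n}
\]
```

For example, with $p = 0.2$ and $n = 5$ this gives $1 - 0.8^{5} \approx 0.67$, so even a modest per-sample evasion rate grows quickly with the number of resamples.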

large language model, machine learning, side task, (22 more...)

2510.09462

Country: Asia (0.46)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Is GPT-4o mini Blinded by its Own Safety Filters? Exposing the Multimodal-to-Unimodal Bottleneck in Hate Speech Detection

Selvanayagam, Niruthiha, Kurti, Ted

arXiv.org Artificial Intelligence | Sep-18-2025

As Large Multimodal Models (LMMs) become integral to daily digital life, understanding their safety architectures is a critical problem for AI Alignment. This paper presents a systematic analysis of OpenAI's GPT-4o mini, a globally deployed model, on the difficult task of multimodal hate speech detection. Using the Hateful Memes Challenge dataset, we conduct a multi-phase investigation on 500 samples to probe the model's reasoning and failure modes. Our central finding is the experimental identification of a "Unimodal Bottleneck," an architectural flaw where the model's advanced multimodal reasoning is systematically preempted by context-blind safety filters. A quantitative validation of 144 content policy refusals reveals that these overrides are triggered in equal measure by unimodal visual (50%) and textual (50%) content. We further demonstrate that this safety system is brittle, blocking not only high-risk imagery but also benign, common meme formats, leading to predictable false positives. These findings expose a fundamental tension between capability and safety in state-of-the-art LMMs, highlighting the need for more integrated, context-aware alignment strategies to ensure AI systems can be deployed both safely and effectively.

large language model, machine learning, natural language, (18 more...)

2509.13608

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Virtual Fencing Framework for Safe and Efficient Collaborative Robotics

Badguna, Vineela Reddy Pippera, Arab, Aliasghar, Kodavalla, Durga Avinash

arXiv.org Artificial Intelligence | Mar-26-2025

Collaborative robots (cobots) increasingly operate alongside humans, demanding robust real-time safeguarding. Current safety standards (e.g., ISO 10218, ANSI/RIA 15.06, ISO/TS 15066) require risk assessments but offer limited guidance for real-time responses. We propose a virtual fencing approach that detects and predicts human motion, ensuring safe cobot operation. Safety and performance tradeoffs are modeled as an optimization problem and solved via sequential quadratic programming. Experimental validation shows that our method minimizes operational pauses while maintaining safety, providing a modular solution for human-robot collaboration.

I. INTRODUCTION

Cobots, short for collaborative robots, have gained significant traction in various fields, such as manufacturing, assembly, service, education, and healthcare, due to their ability to seamlessly interact with humans while ensuring their physical and mental well-being [1]-[3].

artificial intelligence, machine learning, robot, (17 more...)

2503.20237

Country: North America > United States > New York > Kings County > New York City (0.04)

Genre: Research Report (0.64)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

An indicator for effectiveness of text-to-image guardrails utilizing the Single-Turn Crescendo Attack (STCA)

Kwartler, Ted, Bagan, Nataliia, Banny, Ivan, Aqrawi, Alan, Abbasi, Arian

arXiv.org Artificial Intelligence | Nov-27-2024

The Single-Turn Crescendo Attack (STCA), first introduced in Aqrawi and Abbasi [2024], is an innovative method designed to bypass the ethical safeguards of text-to-text AI models, compelling them to generate harmful content. This technique leverages a strategic escalation of context within a single prompt, combined with trust-building mechanisms, to subtly deceive the model into producing unintended outputs. Extending the application of STCA to text-to-image models, we demonstrate its efficacy by compromising the guardrails of a widely used model, DALL-E 3, achieving outputs comparable to those from the uncensored model Flux Schnell, which served as a baseline control. This study provides a framework for researchers to rigorously evaluate the robustness of guardrails in text-to-image models and benchmark their resilience against adversarial attacks.

guardrail, machine learning, natural language, (19 more...)

2411.18699

Country: Europe > Netherlands > South Holland > Leiden (0.04)

Genre:

Research Report > Promising Solution (0.34)
Research Report > Experimental Study (0.34)

Industry:

Health & Medicine (0.95)
Information Technology > Security & Privacy (0.88)
Law (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.53)

A theory of understanding for artificial intelligence: composability, catalysts, and learning

Zhang, Zijian, Aronowitz, Sara, Aspuru-Guzik, Alán

arXiv.org Artificial Intelligence | Aug-15-2024

Understanding is a crucial yet elusive concept in artificial intelligence (AI). This work proposes a framework for analyzing understanding based on the notion of composability. Given any subject (e.g., a person or an AI), we suggest characterizing its understanding of an object in terms of its ability to process (compose) relevant inputs into satisfactory outputs from the perspective of a verifier. This highly universal framework can readily apply to non-human subjects, such as AIs, non-human animals, and institutions. Further, we propose methods for analyzing the inputs that enhance output quality in compositions, which we call catalysts. We show how the structure of a subject can be revealed by analyzing its components that act as catalysts and argue that a subject's learning ability can be regarded as its ability to compose inputs into its inner catalysts. Finally, we examine the importance of learning ability for AIs to attain general intelligence. Our analysis indicates that models capable of generating outputs that can function as their own catalysts, such as language models, establish a foundation for potentially overcoming existing limitations in AI understanding.

catalyst, knowledge, verifier, (15 more...)

2408.08463

Country:

North America > Canada > Ontario > Toronto (0.14)
South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Materials > Chemicals > Specialty Chemicals (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.88)
(2 more...)

Predictive Braking on a Nonplanar Road

Fork, Thomas, Camozzi, Francesco, Fu, Xiao-Yu, Borrelli, Francesco

arXiv.org Artificial Intelligence | Jun-3-2024

We present an approach for predictive braking of a four-wheeled vehicle on a nonplanar road. Our main contribution is a methodology to consider friction and road contact safety on general smooth road geometry. We use this to develop an active safety system to preemptively reduce vehicle speed for upcoming road geometry, such as off-camber turns. Our system may be used for human-driven or autonomous vehicles and we demonstrate it with a simulated ADAS scenario. We show that loss of control due to driver error on nonplanar roads can be mitigated by our approach.

constraint, safety system, vehicle, (14 more...)

2406.01724

Country:

North America > United States > California > Santa Clara County > Sunnyvale (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.40)

Industry: Automobiles & Trucks (0.71)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.67)

Robot Safety Monitoring using Programmable Light Curtains

Ram, Karnik, Aggarwal, Shobhit, Tamburo, Robert, Ancha, Siddharth, Narasimhan, Srinivasa

arXiv.org Artificial Intelligence | Apr-4-2024

As factories continue to evolve into collaborative spaces with multiple robots working together with human supervisors in the loop, ensuring safety for all actors involved becomes critical. Currently, laser-based light curtain sensors are widely used in factories for safety monitoring. While these conventional safety sensors meet high accuracy standards, they are difficult to reconfigure and can only monitor a fixed user-defined region of space. Furthermore, they are typically expensive. Instead, we leverage a controllable depth sensor, programmable light curtains (PLC), to develop an inexpensive and flexible real-time safety monitoring system for collaborative robot workspaces. Our system projects virtual dynamic safety envelopes that tightly envelop the moving robot at all times and detect any objects that intrude into the envelope. Furthermore, we develop an instrumentation algorithm that optimally places (multiple) PLCs in a workspace to maximize the visibility coverage of robots. Our work enables fence-less human-robot collaboration, while scaling to monitor multiple robots with few sensors. We analyze our system in a real manufacturing testbed with four robot arms and demonstrate its capabilities as a fast, accurate, and inexpensive safety monitoring solution.

light curtain, plc, robot, (16 more...)

2404.03556

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > South Korea > Daegu > Daegu (0.04)

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.90)

AirTouch: Towards Safe Human-Robot Interaction Using Air Pressure Feedback and IR Mocap System

Rakhmatulin, Viktor, Grankin, Denis, Konenkov, Mikhail, Davidenko, Sergei, Trinitatova, Daria, Sautenkov, Oleg, Tsetserukou, Dzmitry

arXiv.org Artificial Intelligence | Jul-31-2023

The growing use of robots in urban environments has raised concerns about potential safety hazards, especially in public spaces where humans and robots may interact. In this paper, we present a system for safe human-robot interaction that combines an infrared (IR) camera with a wearable marker and airflow potential field. IR cameras enable real-time detection and tracking of humans in challenging environments, while controlled airflow creates a physical barrier that guides humans away from dangerous proximity to robots without the need for wearable devices. A preliminary experiment was conducted to measure the accuracy of the perception of safety barriers rendered by controlled air pressure. In a second experiment, we evaluated our approach in an imitation scenario of an interaction between an inattentive person and an autonomous robotic system. Experimental results show that the proposed system significantly improves a participant's ability to maintain a safe distance from the operating robot compared to trials without the system.

artificial intelligence, participant, robot, (13 more...)

2308.00096

Country:

North America > Costa Rica > Heredia Province > Heredia (0.04)
Europe > Switzerland > Geneva > Geneva (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Hardware (0.35)

Technology: Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.64)
